Abstract

Majumder, Reif and Sahu have presented a stochastic model of re- 
versible, error-permitting, two-dimensional tile self-assembly, and showed 
that restricted classes of tile assembly systems achieved equilibrium in 
(expected) polynomial time. One open question they asked was how 
much computational power would be added if the model permitted multi- 
ple nucleation, i.e., independent groups of tiles growing before attaching 
to the original seed assembly. This paper provides a partial answer, by 
proving that if a tile assembly model uses only local binding rules, then 
it cannot use multiple nucleation on a surface to solve certain "simple" 
problems in constant time (time independent of the size of the surface). 
Moreover, this time bound applies to macroscale robotic systems that 
assemble in a three-dimensional grid, not just to tile assembly systems 
on a two-dimensional surface. The proof technique defines a new model 
of distributed computing that simulates tile (and robotic) self-assembly. 
Keywords: self-assembly, multiple nucleation, locally checkable labeling. 

1 Introduction 
1.1 Overview 

Nature is replete with examples of the self-assembly of individual parts into a 
more complex whole, such as the development from zygote to fetus, or, more sim- 
ply, the replication of DNA itself. In his Ph.D. thesis in 1998, Winfree proposed 
a formal mathematical model to reason algorithmically about processes of
self-assembly [21]. Winfree connected the experimental work of Seeman [16] (who
had built "DNA tiles," molecules with unmatched DNA base pairs protruding in 
four directions, so they could be approximated by squares with different "glues" 
on each side) to a notion of tiling the integer plane developed by Wang in the 
1960s [20]. Rothemund, in his own Ph.D. thesis, extended Winfree's original
Tile Assembly Model [14].

Informally speaking, Winfree effectivized Wang tiling, by requiring a tiling 
of the plane to start with an individual seed tile or a connected, finite seed 
assembly. Tiles would then accrete one at a time to the seed assembly, growing 
a seed supertile. A tile assembly system is a finite set of tile types. Tile types 
are characterized by the names of the "glues" they carry on each of their four 
sides, and the binding strength each glue can exert. We assume that when the 
tiles interact "in solution," there are infinitely many tiles of each tile type. Tile 
assembly proceeds in discrete stages. At each stage s, from all possibilities of 
tile attachment at all possible locations (as determined by the glues of the tile 
types and the binding requirements of the system overall), one tile will bind, 
with tile type and location "chosen" nondeterministically from possible legal 
bonds at that stage. (Later, we will generalize this so multiple tiles can bind 
concurrently, at a given stage.) Winfree proved that his Tile Assembly Model 
is Turing universal. 

The abstract Tile Assembly Model (aTAM) is error-free and irreversible — 
tiles always bind correctly, and, once a tile binds, it can never unbind. Adleman 
et al. were the first to define a notion of time complexity for tile assembly, using a 
one-dimensional error-permitting, reversible model, where tiles would assemble 
in a line with some error probability, then be scrambled, and fall back to the 
line [1]. Adleman et al. proved bounds on how long it would take such models to
achieve equilibrium. Majumder, Reif and Sahu have recently presented a two- 
dimensional stochastic model for self-assembly [11], and have shown that some 
tiling problems in their model correspond to rapidly mixing Markov chains — 
Markov chains that reach stationary distribution in time polynomial in the state 
space of legally reachable assemblies. 

While the aTAM is nondeterministic, real-world chemical reactions are prob- 
abilistic, and discrete molecular interactions are often modeled stochastically. 
We will define a class of stochastic self-assembly models that contains the model 
of Majumder et al., and prove a lower bound about any model in that class. 

The tile assembly systems analyzed in [11] had the property that their
equilibrium assemblies were identical (allowing for small error) with their terminal
or complete assemblies, i.e., assemblies that cannot legally evolve further, given 
the rules of the system. This identity does not, however, hold in general. In a 
closed chemical system, where equilibrium may be achieved, it is possible that 
the system at equilibrium might consist almost entirely of large, undesirable 
assemblies that do not perform the desired computation. In these cases, correct 
assembly occurs when the system is out of equilibrium, and can be maintained 
because there is a large kinetic energy barrier to forming undesired structures. 
Therefore, when we discuss the "solution to a problem" in this paper, we identify 
that with the notion of a complete assembly. 

We will prove a time complexity lower bound on the solution of a graph 
coloring problem for a class of self-assembly models, including, but not limited 
to, a generalization of the model of [11]. The tile assembly model in [11], like
the aTAM, allows only for a single seed assembly, and one of the open problems
in [11] was how the model might change if it allowed multiple nucleation, i.e.,
if multiple supertiles could build independently before attaching to a growing 
seed supertile. The main result of this paper provides a time complexity lower 
bound for a class of tile assembly models that permit multiple nucleation on a 2D 
surface or a 3D grid: there is no way for those models to use multiple nucleation 
to achieve a speedup to tiling a surface in constant time (time independent of 
the size of the surface) in order to solve a graph coloring problem, even though 
that graph coloring problem requires only seven tile types to solve in the aTAM. 
This result holds for tile assembly models that are reversible, irreversible,
error-permitting or error-free. In fact, a speedup to constant time is impossible, even
if we relax the model to allow that, at each step s, there is a positive probability 
for every available location that a tile will bind there (instead of requiring that 
exactly one tile bind per stage). 

To our knowledge, the method of proof in this paper is novel: given a tile 
assembly model and a tile assembly system T in that model, we construct a 
distributed network of processors that can simulate the behavior of T as it as- 
sembles on a surface. Our result then follows from the theorem by Naor and 
Stockmeyer that locally checkable labeling (LCL) problems have no local
solution in constant time [12]. This is true for both deterministic and randomized
algorithms, so no constant-time tile assembly system exists that solves an LCL 
problem with a positive probability of success. We consider one LCL problem in
particular, the weak c-coloring problem, and demonstrate a tile set of only seven
tile types that solves the weak c-coloring problem in the abstract Tile Assembly 
Model, even though that same problem is impossible to solve in constant time 
by multiple nucleation on a surface, for a broad class of self-assembly models. 
Intuitively, this demonstrates that even a problem that can be solved in poly- 
nomial time by using a few local rules when starting from a single point, cannot 
necessarily be solved in constant time when starting from multiple points, re- 
gardless of the rule set used. (The abstract Tile Assembly Model can weakly 
c-color an $n \times n$ surface in $O(n^2)$ steps, yet none of the multiple nucleation models
we consider can solve the weak c-coloring problem in constant-many steps.) 

The results of Naor and Stockmeyer we apply are more powerful than needed 
to obtain the time complexity lower bound for a system in which the self- 
assembling agents are as simple as DNA tiles. Our lower bound actually demon- 
strates that constant-time speedup to solve LCL problems is impossible via 
multiple nucleation, even for self-assembling modular robots capable of forming 
physical bonds in a three-dimensional grid, and, in addition, of sending mes- 
sages to their neighbors once they have bonded, and potentially deciding to 
break bonds they previously formed. 

1.2 Background 

In the abstract Tile Assembly Model, one tile is added per stage, so the primary 
complexity measure is not one of time, but of how much information a tile set 
needs in order to solve a particular problem. Several researchers [1] [5] [3] [15] [17]
have investigated the tile complexity (the minimum number of distinct tile types 
required for assembly) of finite shapes, and sets of "scale-equivalent" shapes
(essentially a $\mathbb{Z} \times \mathbb{Z}$ analogue of the Euclidean notion of similar figures). For
example, it is now known that the number of tile types required to assemble a
square of size $n \times n$ (for n any natural number) is $\Omega(\log n / \log \log n)$ [15]. Or,
if T is the set of all discrete equilateral triangles, the asymptotically optimal 
relationship between triangle size and number of tiles required to assemble that 
triangle, is closely related to the Kolmogorov Complexity of a program that 
outputs the triangle as a list of coordinates [17].

Despite these advances in understanding of the complexity of assembling 
finite, bounded shapes, the self-assembly of infinite structures is not as well 
understood. In particular, there are few lower bounds or impossibility results on 
what infinite structures can be self-assembled in the Tile Assembly Model. The 
first such impossibility result appeared in [10], when Lathrop, Lutz and Summers
showed that no finite tile set can assemble the discrete Sierpinski Triangle by 
placing a tile only on the coordinates of the shape itself. (By contrast, Winfree 
had shown that just seven tile types are required to tile the first quadrant of the 
integer plane with tiles of one color on the coordinates of the discrete Sierpinski 
Triangle, and tiles of another color on the coordinates of the complement [21].)
Recently, Patitz and Summers have extended this initial impossibility result 
to other discrete fractals [13], and Lathrop et al. [5] have demonstrated sets in 
$\mathbb{Z} \times \mathbb{Z}$ that are Turing decidable but cannot be self-assembled in Winfree's sense.

To date, there has been no work comparing the strengths of different tile 
assembly models with respect to infinite (nor to finite but arbitrarily large) 
structures. Since self-assembly is a process in which each point has only lo- 
cal knowledge, it is natural to consider whether the techniques of distributed 
computing might be useful for comparing models of self-assembly and prov- 
ing impossibility results about them. This paper is an initial attempt in that 
direction. 

Aggarwal et al. in [3] proposed a generalization of the standard Tile Assembly
Model, which they called the q-Tile Assembly Model. This model permitted
multiple nucleation: tiles did not need to bind immediately to the seed supertile. 
Instead, they could form independent supertiles of size up to some constant q 
before then attaching to the seed supertile. While the main question considered 
in [3] was tile complexity, we can also ask whether multiple nucleation would 
allow an improvement in time complexity. Intuitively: does starting from multiple
points allow us to build things strictly faster than starting from a single
point?

As mentioned above, Majumder, Reif and Sahu recently presented a stochas- 
tic, error-permitting tile assembly model, and calculated the rate of convergence 
to equilibrium for several tile assembly systems [11]. The model in [11] permitted
only a single seed assembly, and addition of one tile to the seed supertile at
each stage. Majumder, Reif and Sahu left as an open question how the model 
might be extended to permit the presence and binding of multiple supertiles. 

Therefore, we can rephrase the "intuitive" question above as follows: Can we 
tile a surface of size $n \times n$ in a constant number of stages, by randomly selecting
nucleation points on the surface, building supertiles of size q or smaller from
those points in $\le q$ stages, and then allowing $\le r$ additional stages for tiles
to fall off and be replaced if the edges of the supertiles contain tiles that bind 
incorrectly? (The assembly achieves equilibrium in constant time because q 
and r do not depend on n.) The partial answer obtained in this paper is that 
locally checkable labeling problems cannot be solved in constant time, if we limit 
ourselves to self-assembly on a surface. 

Limiting ourselves to self-assembly on a surface is significant, because we 
are requiring that agents adhere to a substrate and then never move again, 
unless they dissociate completely from the larger assembly. When assemblies 
multiply nucleate in solution, however, they form disjoint supertiles that can 
float independently until potentially becoming aligned, with some probability. 
A self-assembly model that made this rigorous might be strictly stronger than 
the self-assembly models we consider in this paper, as it is not clear how to 
simulate floating supertiles within our distributed computing models without 
introducing slowdown, as processors simulating locations of the surface would 
have to "pass along" information from one processor to the next, to simulate 
elements of the moving supertile. We leave the possibility of simulating floating 
supertiles to future work. 

Another limitation to our results is that our proof technique applies only 
to self-assembly models whose binding rules are completely local. One could 
imagine models in which supertiles combine (or separate) based on simultaneous 
interactions at several locations, instead of the models we consider in this paper, 
in which the system's behavior at each location depends only on the properties 
of that location's immediate neighbors. The self-assembly literature, to our 
knowledge, contains little regarding self-assembly models with nonlocal binding 
rules, and this could be a fruitful area to investigate. 

Klavins and co-authors have modeled self-assembly phenomena, and programmed
self-assembling modular robots, using graph grammars [6] [8]. Klavins
in [7] informally compares the limitations of the "distributed algorithms" of
graph grammars (used to program self-assembling robots) to impossibility re- 
sults in distributed computing. Recently, we have shown connections between 
self-assembly and the wait-free consensus hierarchy [18], and we have embedded
the "graph assembly systems" of Klavins into a known graph grammar
characterization of distributed systems [19]. The present paper, to the best of our
knowledge, is the first to construct a formal reduction from self-assembly models 
to models of distributed computing. 

Section 2 of this paper describes the abstract Tile Assembly Model, and then 
considers generalizations of the standard model that permit multiple nucleation. 
Section 3 reviews the distributed computing results of Naor and Stockmeyer 
needed to prove the impossibility result. In Section 4 we present our simulation 
technique and lower bound results. Section 5 concludes the paper and suggests 
directions for future research. 
Figure 1: An example tile with explanation. The tile is named "Y1". Its north
side has glue type "Y0" and binding strength 2, represented by a double line.
Its east side has glue type "0" and binding strength 1, represented by a single
line. Its south side has glue type "Y1" and binding strength 2. Its west side
has binding strength 0, represented by a dashed line.

2 Description of Self-Assembly Models
2.1 The abstract Tile Assembly Model 

Winfree's objective in defining the Tile Assembly Model was to provide a useful
mathematical abstraction of DNA tiles combining in solution [21]. Rothemund [14],
and Rothemund and Winfree [15], extended the original definition
of the model. For a comprehensive introduction to tile assembly, we refer the
reader to [14]. In our presentation here, we follow [10], which gives equal status
to finite and infinite tile assemblies. 

Intuitively, a tile of type t is a unit square that can be placed with its center 
on a point in the integer lattice. A tile has a unique orientation; it can be 
translated, but not rotated. We identify the side of a tile with the direction (or 
unit vector) one must travel from the center to cross that side. The literature 
often refers to west, north, east and south sides, starting at the leftmost side 
and proceeding clockwise. Each side of a tile is covered with a "glue" that has 
a color and a strength. Figure 1 shows how a tile is represented graphically. 

If tiles of types t and t' are placed adjacent to each other, then they will 
bind with the strength shared by both adjacent sides if the glues on those sides 
are the same. Note that this definition of binding implies that if the glues of 
the adjacent sides do not have the same color or strength, then their binding 
strength is 0. Later, we will permit pairs of glues to have negative binding 
strength, to model error occurrence and correction. 
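
As an illustration of the binding rule just described (a minimal sketch, not part of the formal development), the following Python fragment represents a tile type as a map from sides to (glue color, strength) pairs and computes the strength with which two adjacent tiles bind. The names `Tile`, `bond_strength`, and the example glue labels are assumptions introduced only for this example.

```python
from typing import Dict, Tuple

Vec = Tuple[int, int]
NORTH, EAST, SOUTH, WEST = (0, 1), (1, 0), (0, -1), (-1, 0)

# A tile type maps each side (a unit vector) to a (glue color, strength) pair.
Tile = Dict[Vec, Tuple[str, int]]


def bond_strength(t: Tile, t_prime: Tile, direction: Vec) -> int:
    """Strength of the bond when t_prime sits next to t in `direction`.

    The glues on the two touching sides must agree in both color and
    strength; otherwise the binding strength is 0.
    """
    opposite = (-direction[0], -direction[1])
    color_a, strength_a = t[direction]
    color_b, strength_b = t_prime[opposite]
    if color_a == color_b and strength_a == strength_b:
        return strength_a
    return 0


# Hypothetical tile types, used only to exercise the function.
left = {NORTH: ("a", 1), EAST: ("g", 2), SOUTH: ("a", 1), WEST: ("", 0)}
right = {NORTH: ("b", 1), EAST: ("", 0), SOUTH: ("b", 1), WEST: ("g", 2)}
mismatch = {NORTH: ("b", 1), EAST: ("", 0), SOUTH: ("b", 1), WEST: ("h", 2)}
```

Here `bond_strength(left, right, EAST)` evaluates to 2, while `bond_strength(left, mismatch, EAST)` evaluates to 0 because the glue colors differ.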

One parameter in a tile assembly model is the minimum binding strength 
required for tiles to bind "stably." This parameter is usually termed temperature
and denoted by $\tau$, where $\tau \in \mathbb{N}$.

As we consider only two-dimensional tile assemblies, we limit ourselves to
working in $\mathbb{Z}^2 = \mathbb{Z} \times \mathbb{Z}$. $U_2$ is the set of all unit vectors in $\mathbb{Z}^2$.

A binding function on an (undirected) graph $G = (V, E)$ is a function
$\beta : E \to \mathbb{N}$. If $\beta$ is a binding function on a graph $G = (V, E)$ and $C = (C_0, C_1)$ is
a cut of $G$, then the binding strength of $\beta$ on $C$ is

$$\beta_C = \sum \{\beta(e) \mid e \in E,\ e \cap C_0 \neq \emptyset, \text{ and } e \cap C_1 \neq \emptyset\}.$$

The binding strength of $\beta$ on $G$ is then $\beta(G) = \min\{\beta_C \mid C \text{ is a cut of } G\}$.
Intuitively, the binding function captures the strength with which any two
neighbors are bound together, and the binding strength of the graph is the minimum
strength of bonds that would have to be severed in order to separate the graph
into two pieces.
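
The cut-based definition can be checked directly on small examples. The sketch below is a brute-force illustration (exponential in the number of vertices, so suitable only for tiny graphs) that computes the binding strength of a graph as the minimum, over all cuts, of the total strength of the edges crossing the cut. The function names and the example graph are assumptions made for illustration, and a cut is assumed to have two nonempty sides.

```python
from itertools import combinations


def cut_strength(edges, side0):
    """Total strength of the edges with exactly one endpoint in side0."""
    return sum(w for (u, v), w in edges.items() if (u in side0) != (v in side0))


def binding_strength(vertices, edges):
    """Minimum cut strength over all cuts (C0, C1) with both sides nonempty."""
    vertices = list(vertices)
    if len(vertices) < 2:
        raise ValueError("a cut needs at least two vertices")
    first, rest = vertices[0], vertices[1:]
    best = None
    # Keep `first` in C0 so that each unordered cut is enumerated once;
    # r stops short of len(rest) so that C1 is never empty.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            strength = cut_strength(edges, {first, *extra})
            best = strength if best is None else min(best, strength)
    return best


def is_stable(vertices, edges, tau):
    """True iff the binding graph is stable at temperature tau."""
    return binding_strength(vertices, edges) >= tau


# Hypothetical 2 x 2 assembly: two strength-2 bonds and two strength-1 bonds.
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
bonds = {((0, 0), (1, 0)): 2, ((0, 0), (0, 1)): 2,
         ((1, 0), (1, 1)): 1, ((0, 1), (1, 1)): 1}
```

In this example the weakest cut isolates the tile at (1, 1), severing two strength-1 bonds, so the assembly is stable at temperature 2 but not at temperature 3.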

A binding graph is an ordered triple $G = (V, E, \beta)$ where $(V, E)$ is a graph
and $\beta$ is a binding function on $(V, E)$. If $\tau \in \mathbb{N}$, a binding graph $G = (V, E, \beta)$
is $\tau$-stable if $\beta((V, E)) \ge \tau$.

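Recall that a grid graph is a graph $G = (V, E)$ where $V \subseteq \mathbb{Z} \times \mathbb{Z}$ and every
edge $\{\vec{m}, \vec{n}\} \in E$ has the property that $\vec{m} - \vec{n} \in U_2$. We write $[V]^2$ for the
set $\{\{v_1, v_2\} \mid v_1 \in V,\ v_2 \in V, \text{ and } v_1 \neq v_2\}$, i.e., the two-element subsets of $V$.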

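Definition 1. A tile type over a (finite) alphabet $\Sigma$ is a function
$t : U_2 \to \Sigma^* \times \mathbb{N}$. We write $t = (\mathrm{col}_t, \mathrm{str}_t)$, where $\mathrm{col}_t : U_2 \to \Sigma^*$ and $\mathrm{str}_t : U_2 \to \mathbb{N}$
are defined by $t(\vec{u}) = (\mathrm{col}_t(\vec{u}), \mathrm{str}_t(\vec{u}))$ for all $\vec{u} \in U_2$.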

Definition 2. If $T$ is a set of tile types, a $T$-configuration is a partial function
$\alpha : \mathbb{Z}^2 \dashrightarrow T$.

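Definition 3. The binding graph of a $T$-configuration $\alpha : \mathbb{Z}^2 \dashrightarrow T$ is the
binding graph $G_\alpha = (V, E, \beta)$, where $(V, E)$ is the grid graph given by

$$V = \mathrm{dom}(\alpha),$$

$$E = \{\{\vec{m}, \vec{n}\} \in [V]^2 \mid \vec{m} - \vec{n} \in U_2,\ \mathrm{col}_{\alpha(\vec{m})}(\vec{n} - \vec{m}) = \mathrm{col}_{\alpha(\vec{n})}(\vec{m} - \vec{n}), \text{ and } \mathrm{str}_{\alpha(\vec{m})}(\vec{n} - \vec{m}) > 0\},$$

and the binding function $\beta : E \to \mathbb{Z}^+$ is given by $\beta(\{\vec{m}, \vec{n}\}) = \mathrm{str}_{\alpha(\vec{m})}(\vec{n} - \vec{m})$
for all $\{\vec{m}, \vec{n}\} \in E$.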

Definition 4. For $T$ a set of tile types, a $T$-configuration $\alpha$ is $\tau$-stable if its
binding graph $G_\alpha$ is $\tau$-stable. A $\tau$-$T$-assembly is a $T$-configuration that is
$\tau$-stable. We write $\mathcal{A}^\tau_T$ for the set of all $\tau$-$T$-assemblies.

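Definition 5. Let $\alpha$ and $\alpha'$ be $T$-configurations.

1. $\alpha$ is a subconfiguration of $\alpha'$, and we write $\alpha \sqsubseteq \alpha'$, if $\mathrm{dom}(\alpha) \subseteq \mathrm{dom}(\alpha')$
and, for all $\vec{m} \in \mathrm{dom}(\alpha)$, $\alpha(\vec{m}) = \alpha'(\vec{m})$.

2. $\alpha'$ is a single-tile extension of $\alpha$ if $\alpha \sqsubseteq \alpha'$ and $\mathrm{dom}(\alpha') \setminus \mathrm{dom}(\alpha)$ is a
singleton set. In this case, we write $\alpha' = \alpha + (\vec{m} \mapsto t)$, where $\{\vec{m}\} =
\mathrm{dom}(\alpha') \setminus \mathrm{dom}(\alpha)$ and $t = \alpha'(\vec{m})$.

3. The notation $\alpha \xrightarrow[\tau,T]{1} \alpha'$ means that $\alpha, \alpha' \in \mathcal{A}^\tau_T$ and $\alpha'$ is a single-tile
extension of $\alpha$. (The "1" above the arrow is to denote that a single tile is
added at this step.)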

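Definition 6. Let $\alpha \in \mathcal{A}^\tau_T$.

1. For each $t \in T$, the $\tau$-$t$-frontier of $\alpha$ is the set

$$\partial^\tau_t \alpha = \Big\{ \vec{m} \in \mathbb{Z}^2 \setminus \mathrm{dom}(\alpha) \;\Big|\; \sum_{\vec{u} \in U_2} \mathrm{str}_t(\vec{u}) \cdot [\![ \alpha(\vec{m} + \vec{u})(-\vec{u}) = t(\vec{u}) ]\!] \ge \tau \Big\},$$

where $[\![ \varphi ]\!]$ is 1 if $\varphi$ holds and 0 otherwise.

2. The $\tau$-$T$-frontier of $\alpha$ is the set

$$\partial^\tau \alpha = \bigcup_{t \in T} \partial^\tau_t \alpha.$$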
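
The frontier of Definition 6 can be computed directly from a configuration. The following Python sketch is offered only as an illustration: it scans the empty locations adjacent to the assembly and, for each tile type, sums the strengths of the sides whose glue (color and strength) equals the facing glue of the neighbor. The names `t_frontier` and `frontier`, the dictionary encoding of tiles, and the example tiles are assumptions, and the temperature is assumed to be at least 1 so that only locations adjacent to the assembly need to be examined.

```python
from typing import Dict, Set, Tuple

Vec = Tuple[int, int]
U2 = [(0, 1), (1, 0), (0, -1), (-1, 0)]
Tile = Dict[Vec, Tuple[str, int]]  # side -> (glue color, strength)


def t_frontier(alpha: Dict[Vec, Tile], t: Tile, tau: int) -> Set[Vec]:
    """Empty locations where tile type t would bind with strength >= tau."""
    candidates = {(x + dx, y + dy) for (x, y) in alpha for dx, dy in U2}
    candidates -= set(alpha)
    result = set()
    for (x, y) in candidates:
        total = 0
        for dx, dy in U2:
            neighbor = alpha.get((x + dx, y + dy))
            # The neighbor's facing side must carry the same glue as t's side.
            if neighbor is not None and neighbor[(-dx, -dy)] == t[(dx, dy)]:
                total += t[(dx, dy)][1]
        if total >= tau:
            result.add((x, y))
    return result


def frontier(alpha, tile_types, tau):
    """Union of the t-frontiers over all tile types."""
    out = set()
    for t in tile_types:
        out |= t_frontier(alpha, t, tau)
    return out


# Hypothetical example: a seed whose east glue matches the west glue of t.
seed_tile = {(0, 1): ("a", 1), (1, 0): ("g", 2), (0, -1): ("a", 1), (-1, 0): ("", 0)}
t = {(0, 1): ("b", 1), (1, 0): ("", 0), (0, -1): ("b", 1), (-1, 0): ("g", 2)}
alpha = {(0, 0): seed_tile}
```

With these example tiles, the only frontier location at temperature 2 is (1, 0), and the frontier is empty at temperature 3.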

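Definition 7. A $\tau$-$T$-assembly sequence is a sequence $\vec{\alpha} = (\alpha_i \mid 0 \le i < k)$ in
$\mathcal{A}^\tau_T$, where $k \in \mathbb{Z}^+ \cup \{\infty\}$ and, for each $i$ with $1 \le i + 1 < k$, $\alpha_i \xrightarrow[\tau,T]{1} \alpha_{i+1}$.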

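Definition 8. The result of a $\tau$-$T$-assembly sequence $\vec{\alpha} = (\alpha_i \mid 0 \le i < k)$
is the unique $T$-configuration $\alpha = \mathrm{res}(\vec{\alpha})$ satisfying: $\mathrm{dom}(\alpha) = \bigcup_{0 \le i < k} \mathrm{dom}(\alpha_i)$
and $\alpha_i \sqsubseteq \alpha$ for each $0 \le i < k$.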

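Definition 9. Let $\alpha, \alpha' \in \mathcal{A}^\tau_T$. A $\tau$-$T$-assembly sequence from $\alpha$ to $\alpha'$ is a
$\tau$-$T$-assembly sequence $\vec{\alpha} = (\alpha_i \mid 0 \le i < k)$ such that $\alpha_0 = \alpha$ and $\mathrm{res}(\vec{\alpha}) = \alpha'$.
We write $\alpha \xrightarrow[\tau,T]{} \alpha'$ to indicate that there exists a $\tau$-$T$-assembly sequence from $\alpha$ to $\alpha'$.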

Definition 10. An assembly $\alpha \in \mathcal{A}^\tau_T$ is terminal if $\partial^\tau \alpha = \emptyset$.

Intuitively, a configuration is a set of tiles that have been placed in the plane, 
and the configuration is stable if the binding strength at every possible cut is at 
least as high as the temperature of the system. Informally, an assembly sequence 
is a sequence of single-tile additions to the frontier of the assembly constructed 
at the previous stage. Assembly sequences can be finite or infinite in length. 
We are now ready to present a definition of a tile assembly system. 

Definition 11. Write $\mathcal{A}^\tau_T$ for the set of configurations, stable at temperature
$\tau$, of tiles whose tile types are in $T$. A tile assembly system is an ordered triple
$\mathcal{T} = (T, \sigma, \tau)$ where $T$ is a finite set of tile types, $\sigma \in \mathcal{A}^\tau_T$ is the seed assembly,
and $\tau$ is the temperature. We require $\mathrm{dom}(\sigma)$ to be finite.

Definition 12. Let $\mathcal{T} = (T, \sigma, \tau)$ be a tile assembly system.

1. Then the set of assemblies produced by $\mathcal{T}$ is

$$\mathcal{A}[\mathcal{T}] = \{\alpha \in \mathcal{A}^\tau_T \mid \sigma \xrightarrow[\tau,T]{} \alpha\},$$

where "$\sigma \xrightarrow[\tau,T]{} \alpha$" means that tile configuration $\alpha$ can be obtained from seed
assembly $\sigma$ by a legal addition of tiles.

2. The set of terminal assemblies produced by $\mathcal{T}$ is

$$\mathcal{A}_\Box[\mathcal{T}] = \{\alpha \in \mathcal{A}[\mathcal{T}] \mid \alpha \text{ is terminal}\},$$

where "terminal" describes a configuration to which no tiles can be legally
added.

If we view tile assembly as the programming of matter, the following analogy 
is useful: the seed assembly is the input to the computation; the additions of
tiles to the growing assembly are the legal steps the computation can take;
the temperature is the primary inference rule of the system; and the terminal 
assemblies are the possible outputs. 
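
The analogy can be made concrete with a small simulation. The sketch below is a hedged illustration rather than a definitive implementation: starting from a seed, it repeatedly chooses one legal attachment at random (one tile per stage, as in the aTAM) until no tile can bind, at which point the assembly is terminal. The helper names `tile`, `attachable`, and `assemble`, the three-tile system, and the `max_steps` cutoff are all assumptions introduced for this example.

```python
import random

U2 = [(0, 1), (1, 0), (0, -1), (-1, 0)]
NONE = ("", 0)  # a side with no glue


def tile(north=NONE, east=NONE, south=NONE, west=NONE):
    """Build a tile type as a map from sides to (glue color, strength)."""
    return {(0, 1): north, (1, 0): east, (0, -1): south, (-1, 0): west}


def attachable(alpha, tile_types, tau):
    """All (location, tile name) pairs that may legally bind to alpha."""
    empty = {(x + dx, y + dy) for (x, y) in alpha for dx, dy in U2} - set(alpha)
    moves = []
    for (x, y) in sorted(empty):
        for name, t in sorted(tile_types.items()):
            total = 0
            for dx, dy in U2:
                nb = alpha.get((x + dx, y + dy))
                if nb is not None and tile_types[nb][(-dx, -dy)] == t[(dx, dy)]:
                    total += t[(dx, dy)][1]
            if total >= tau:
                moves.append(((x, y), name))
    return moves


def assemble(tile_types, seed, tau, rng, max_steps=1000):
    """Add one tile per stage until the assembly is terminal."""
    alpha = dict(seed)  # location -> tile name
    for _ in range(max_steps):
        moves = attachable(alpha, tile_types, tau)
        if not moves:
            return alpha  # empty frontier: the assembly is terminal
        loc, name = rng.choice(moves)
        alpha[loc] = name
    raise RuntimeError("no terminal assembly reached within max_steps")


# Hypothetical system: seed S, then A, then B, growing eastward and stopping.
TILES = {
    "S": tile(east=("a", 2)),
    "A": tile(west=("a", 2), east=("b", 2)),
    "B": tile(west=("b", 2)),
}
terminal = assemble(TILES, {(0, 0): "S"}, tau=2, rng=random.Random(0))
```

In this toy system exactly one attachment is legal at every stage, so every run produces the same terminal assembly, a row of three tiles S, A, B.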

We are, of course, interested in being able to prove that a certain tile assem- 
bly system always achieves a certain output. In [17], Soloveichik and Winfree 
presented a strong technique for this: local determinism. 

Informally, an assembly sequence $\vec{\alpha}$ is locally deterministic if (1) each tile
added in $\vec{\alpha}$ binds with the minimum strength required for binding; (2) if there
is a tile of type $t$ at location $\vec{m}$ in the result of $\vec{\alpha}$, and $t$ and the immediate
"OUT-neighbors" of $t$ are deleted from the result of $\vec{\alpha}$, then no other tile type
in $T$ can legally bind at $\vec{m}$; and (3) the result of $\vec{\alpha}$ is terminal. We formalize these
points as follows.

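Definition 13 (Soloveichik and Winfree [17]). A $\tau$-$T$-assembly sequence $\vec{\alpha} =
(\alpha_i \mid 0 \le i < k)$ with result $\alpha$ is locally deterministic if it has the following three
properties.

1. For all $\vec{m} \in \mathrm{dom}(\alpha) - \mathrm{dom}(\alpha_0)$,

$$\sum_{\vec{u} \in \mathrm{IN}^{\vec{\alpha}}(\vec{m})} \mathrm{str}_{\alpha(\vec{m})}(\vec{u}) = \tau,$$

where $\mathrm{IN}^{\vec{\alpha}}(\vec{m})$ means the sides of the tile that bound at location $\vec{m}$ during
assembly sequence $\vec{\alpha}$ that contributed nonzero strength during the stage at
which the tile bound. (Informally, these are the "input sides" of the tile
at location $\vec{m}$, with respect to assembly sequence $\vec{\alpha}$.)

2. For all $\vec{m} \in \mathrm{dom}(\alpha) - \mathrm{dom}(\alpha_0)$ and all $t \in T - \{\alpha(\vec{m})\}$, $\vec{m} \notin \partial^\tau_t(\vec{\alpha} \setminus \vec{m})$.

3. $\partial^\tau \alpha = \emptyset$.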

Definition 14 (Soloveichik and Winfree [17]). A tile assembly system $\mathcal{T}$ is
locally deterministic if there exists a locally deterministic $\tau$-$T$-assembly sequence
$\vec{\alpha} = (\alpha_i \mid 0 \le i < k)$ with $\alpha_0 = \sigma$.
Local determinism is important because of the following result.

Theorem 1 (Soloveichik and Winfree [17]). If $\mathcal{T}$ is locally deterministic, then
$\mathcal{T}$ has a unique terminal assembly.

2.2 More general self-assembly models 

We move now from DNA tiles self-assembling on a two-dimensional surface, to a
more general setting, where self-assembling "agents" with the ability not just to 
bind but also to communicate after binding and potentially unbind, can assemble 
either in the plane or in three-space. One could think of these agents as
(nano- or macroscale) robots that interlock physically, and, after interlocking, 
can send their neighbors electronic messages of low complexity. Based on receipt 
of messages, the robots can then decide to break bonds to one or more of their 
neighbors. Such modular robots have already been implemented in laboratory 
experiments [7]. Further, these robots may be constructed so each has (at least 
with high probability) a unique identification code — permitting transmission of 
strictly more information than is possible in the setting of tile self-assembly, in 
which tiles do not have unique identifiers. 

We will consider generalizations of the abstract Tile Assembly Model that
include the following: (1) multiple nucleation; (2) assembly in which glues bind
incorrectly according to some error probability; (3) negative glue strengths,
allowing incorrectly bound tiles to be released from the assembly so it is
possible for a correctly-binding tile to attach in that location; (4) a third spatial
dimension; and (5) tiles that can now be "agents," i.e., finite state machines with
algorithms and unique identifiers. We formalize this as follows.

Definition 15. A d-regular self-assembling agent type $T$ is a finite state
machine of form $T = (A, (g_1, \ldots, g_d))$, where $A$ is a (deterministic or
probabilistic) algorithm and the $g_i$'s are finite strings over a finite alphabet (codes for the
glue types associated with T) that are hardcoded into the machine. The algorithm 
A can be null (in the case of passive self-assembly like DNA tiles), or can decide 
whether to transmit messages of length bounded by a constant to neighbors based 
on the agent's interaction with neighboring glue types. 

We will assume that all agent types have identical geometric structure, and 
their d glues all have the same orientation. For example, in the aTAM, all agent 
types are unit squares, oriented north, east, south, west. Also, for simplicity of 
the proof, we will assume that our agents are memoryless. However, because of 
the generality of the results of Naor and Stockmeyer, our lower bound results 
would still hold if agents could make active self-assembly decisions based on a 
history of messages received from neighbors, not just the current messages and 
glue types of their neighbors. 

Definition 16. The binary relation $R$ is a set of binding rules for the (finite)
set of agent types $\{T_i\}_i$ if, for any $(x, y) \in R$, both $x$ and $y$ are glue types that
appear in elements of $\{T_i\}_i$.
Definition 17. The function $\beta$ is an assignment of binding strengths for the
set of binding rules $R$, if the domain of $\beta$ is $R$, and the range of $\beta$ is the set of
nonnegative integers.

Definition 18. $\mathcal{M}$ is a model of d-regular self-assembling agents if $\mathcal{M} =
(\{T_i\}_i, R, \beta, \tau, \sigma)$, where $\{T_i\}_i$ is a (finite) set of d-regular self-assembling agent
types, $R$ is a set of binding rules for $\{T_i\}_i$, $\beta$ is an assignment of binding
strengths for $R$, $\tau$ is the temperature of the system (the threshold binding strength
for bonds to be stable), and $\sigma$ is an initial (finite) seed assembly.

The algorithm of each agent type may include a variable MY-ID, and we 
allow the possibility that each agent in the system does, in fact, have a unique 
identification number. This might be appropriate when modeling robotic self- 
assembly. In the case of molecular self-assembly, each agent is anonymous. 
Assembly systems in both the aTAM and the stochastic model of Majumder 
et al. can be defined in this formalism, by giving each agent type an algo- 
rithm that performs no instructions, and defining (respectively deterministic or 
probabilistic) binding relations in a natural way. 

To conclude this section, we formalize what it means for a self-assembly 
model to allow multiple nucleation on a surface. 

Definition 19. Let $\mathcal{M}$ be a model of d-regular self-assembling agents. We
say $\mathcal{M}$ allows multiple nucleation if, in addition to the placement of the seed
assembly at the initial stage of assembly, there is some probability $\pi_{\mathcal{M}}$ such that (at the
first stage of assembly only) an agent is placed on each location of the surface
with probability $\pi_{\mathcal{M}}$. Further, if an agent is placed at location $\vec{m}$ because of
multiple nucleation, its agent type is chosen uniformly at random from the space
of possible agent types.
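
The nucleation step of Definition 19 is easy to simulate. The following Python sketch is an illustration under stated assumptions: the surface is taken to be an n x n grid, agent types are represented by arbitrary labels, and locations already occupied by the seed assembly are assumed not to be overwritten (the definition does not address this case). The function name `nucleate` and its parameters are hypothetical.

```python
import random


def nucleate(n, agent_types, seed, pi, rng):
    """Sample the first-stage configuration on an n x n surface.

    Each location not covered by the seed independently receives an agent
    with probability pi; its type is drawn uniformly from agent_types.
    """
    config = dict(seed)
    for x in range(n):
        for y in range(n):
            if (x, y) in config:
                continue  # assumption: seed locations are left untouched
            if rng.random() < pi:
                config[(x, y)] = rng.choice(agent_types)
    return config


# Hypothetical usage: a 4 x 4 surface, two agent types, a one-agent seed.
seed = {(0, 0): "seed"}
sparse = nucleate(4, ["A", "B"], seed, 0.0, random.Random(1))
dense = nucleate(4, ["A", "B"], seed, 1.0, random.Random(1))
```

With nucleation probability 0 only the seed is present, which recovers single-seed assembly; with probability 1 every location of the surface is occupied after the first stage.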

We could allow multiple nucleation to occur at multiple stages during the 
assembly, not just the first. Again, because of the generality of Naor and Stock- 
meyer's results, that would not affect our lower bound proof. 

3 Distributed Computing Results of Naor and 
Stockmeyer 

In a well-known distributed computing paper, Naor and Stockmeyer investigated
whether "locally checkable labeling" problems could be solved over a network of
processors in an entirely local manner, where a local solution means a solution
arrived at "within time (or distance) independent of the size of the network" [12].
One locally checkable labeling problem Naor and Stockmeyer considered was the 
weak c-coloring problem. 

Definition 20 (Naor and Stockmeyer [12]). For $c \in \mathbb{N}$, a weak c-coloring of a
graph is an assignment of numbers from {1, . . . , c} (the possible "colors") to the 
vertices of the graph such that for every non-isolated vertex v there is at least 
one neighbor w such that v and w receive different colors. Given a graph G, the 
weak c-coloring problem for G is to weak c-color the nodes of G. 
In the context of tiling, to solve the weak c-coloring problem for an $n \times n$
surface means tiling the surface so each tile has at least one neighbor (to the 
north, south, east or west) of a different color. In the next section, we will 
present a simple solution to the weak c-coloring problem in the abstract Tile 
Assembly Model. By contrast, Naor and Stockmeyer showed that no local, 
constant-time algorithm can solve the weak c-coloring problem for grid graphs, 
nor for k-dimensional meshes, a generalization of grid graphs which we now
define. 

Definition 21. A k-dimensional mesh is a graph with vertex set $\{0, 1, \ldots, m\}^k$
for some $m$, such that two vertices are connected by an edge if the $\ell_1$-distance
between them is 1.
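
Because weak c-coloring is locally checkable, verifying a candidate coloring requires only inspecting each vertex and its neighbors. The Python sketch below illustrates this on a k-dimensional mesh; it checks a given coloring and does not attempt to produce one. The function names and the example colorings are assumptions made for illustration.

```python
from itertools import product


def mesh_neighbors(v, m):
    """Neighbors of v in the k-dimensional mesh on {0, ..., m}^k."""
    for i in range(len(v)):
        for delta in (-1, 1):
            c = v[i] + delta
            if 0 <= c <= m:
                yield v[:i] + (c,) + v[i + 1:]


def is_weak_coloring(coloring, m, k, c):
    """True iff `coloring` is a weak c-coloring of the k-dimensional mesh."""
    for v in product(range(m + 1), repeat=k):
        if not 1 <= coloring[v] <= c:
            return False
        nbrs = list(mesh_neighbors(v, m))
        # A non-isolated vertex needs at least one differently colored neighbor.
        if nbrs and all(coloring[w] == coloring[v] for w in nbrs):
            return False
    return True


# Hypothetical colorings of the 4 x 4 grid (m = 3, k = 2).
checkerboard = {v: 1 + sum(v) % 2 for v in product(range(4), repeat=2)}
stripes = {v: 1 + v[0] % 2 for v in product(range(4), repeat=2)}
constant = {v: 1 for v in product(range(4), repeat=2)}
```

The striped coloring shows why the problem is called weak: vertically adjacent vertices share a color, yet every vertex still has a horizontal neighbor of a different color, so the coloring is accepted. The constant coloring is rejected.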

Theorem 2 (Naor and Stockmeyer [12]). For any natural numbers c, k and
t, there is no local algorithm with time bound t that solves the weak c-coloring
problem for the class of k-dimensional meshes. (This remains true even if the
processors have unique identifiers and can transmit them as part of the local 
algorithm.) 

A second theorem from the same paper says that randomization does not 
help. The original result is stronger than the formulation here. 

Theorem 3 (Naor and Stockmeyer [12]). Fix a class Q of graphs closed under
disjoint union. If there is a randomized local algorithm P with time bound t that
solves the weak c-coloring problem for Q with error probability $\epsilon$ for some $\epsilon < 1$,
then there is a deterministic local algorithm A with time bound t that solves the 
weak c-coloring problem for Q. 

4 Simulation of Self-Assembly on a Surface

In order to apply the theorems of Naor and Stockmeyer to the realm of self- 
assembly, we build a distributed network of processors that reduces a self- 
assembly problem to a distributed computing problem. The motivating in- 
tuition is that each processor simulates a location of the surface, and reports to 
its neighbors whether there is a (simulated) agent at that location. Formally, 
we prove the following theorem. 

Theorem 4. Let ℳ be a model of d-regular self-assembling agents for any
natural number d > 0, such that ℳ self-assembles on a k-dimensional mesh of
size n^k, and ℳ allows multiple nucleation. Then there is a model 𝒩 of
distributed computing that simulates ℳ using processors with the network
topology of a k-dimensional mesh, and constant-size message complexity.

Proof. Fix a model ℳ of d-regular self-assembling agents as in the theorem
statement. Let α be a configuration of agents on the mesh of size n^k. Let Γ be
the set of glue types of ℳ and M the set of electronic messages of ℳ. (Both
Γ and M are finite sets.) The definition of the binding function β induces a
function

β : (T ∪ {∅}) × (Γ ∪ {∅})^d × (M ∪ {∅})^d × (T ∪ {∅}) → [0, 1]

such that β takes as input the (possibly empty) agent type α(m) at some
location m in configuration α, and, based on the glue types and electronic messages
received from the d neighbors that could be incident to an agent at m,
returns, for each agent type, the probability that α(m) would contain that agent
type, over the space of all legal ℳ-assembly sequences that start with
configuration α and run for one time step. In particular, for fixed t_0 ∈ T, fixed
γ_1, ..., γ_d ∈ Γ ∪ {∅}, and fixed m_1, ..., m_d ∈ M ∪ {∅}, it is true that

Σ_{t ∈ T ∪ {∅}} β(t_0, γ_1, ..., γ_d, m_1, ..., m_d, t) = 1.
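As a concrete reading of this normalization condition, the following Python sketch defines a toy binding rule and checks that, for every fixed input, the probabilities over T ∪ {∅} sum to 1. Everything named here is a hypothetical stand-in: the agent types "a" and "b", the single glue "g", the single message "x", d = 2, and the particular probabilities are chosen only for illustration. For convenience the toy beta returns the whole distribution over next agent types as a dictionary, rather than taking the next type as a fourth argument, and it ignores the electronic messages.

```python
from itertools import product
import math

EMPTY = None                      # stands for the empty set
T = ["a", "b"]                    # hypothetical agent types
GLUES = ["g"]                     # hypothetical glue set (Gamma)
MSGS = ["x"]                      # hypothetical message set (M)
D = 2                             # number of neighbors per agent

def beta(t0, glues, msgs):
    """Toy binding rule: return {t: probability} over T and EMPTY.
    The messages are ignored in this toy example."""
    bonds = sum(g is not EMPTY for g in glues)
    if t0 is EMPTY:
        # An empty location fills more readily when more glues are present.
        p_fill = bonds / (2 * D)
        return {EMPTY: 1 - p_fill, "a": p_fill / 2, "b": p_fill / 2}
    # An occupied location is more likely to stay when it has more bonds.
    p_stay = 0.5 + bonds / (2 * D)
    return {t0: p_stay, EMPTY: 1 - p_stay, **{t: 0.0 for t in T if t != t0}}

def is_normalized(beta_fn):
    """Check that the distribution sums to 1 for every possible input."""
    options_g = GLUES + [EMPTY]
    options_m = MSGS + [EMPTY]
    for t0 in T + [EMPTY]:
        for glues in product(options_g, repeat=D):
            for msgs in product(options_m, repeat=D):
                total = sum(beta_fn(t0, glues, msgs).values())
                if not math.isclose(total, 1.0):
                    return False
    return True

print(is_normalized(beta))
```

Because T, Γ and M are finite and d is fixed, the table of all inputs is finite, so a check of this kind always terminates.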

We have not formally defined ℳ-assembly sequences, but they are a natural
extension of the τ-T-assembly sequences of tile self-assembly, where the β and
τ of ℳ are used to determine whether agents bind stably to one another. Also,
if an agent lies on the edge of the n^k-size surface, so it does not have a
neighbor in a particular direction, we define β so that the empty set is the glue
and electronic message "transmitted" from the "neighbor" in that direction.

We simulate assembly sequences of ℳ on a k-dimensional mesh where each
of the dimensions has length n by a network of processors 𝒩 whose network
graph is also a k-dimensional mesh of total size n^k. Each processor will simulate
the presence or absence of an agent in the same location on the assembly surface.
We interpret bonds between two agents as messages. On top of those
messages, we add a set of electronic messages that agents can send to their
neighbors, and encode the combination as an ordered pair: glue type and
electronic message. The function β will be the probabilistic transition function
for processors in this system.

Processors of 𝒩 are of the following form.

Processor p_i

d-many input message buffers: inbuf_{i,1}, ..., inbuf_{i,d}.

d-many output message buffers: outbuf_{i,1}, ..., outbuf_{i,d}.

A color variable: COLOR_i, a variable that can take a value from {1, ..., c},
where c is a global constant.

A local state: Each processor is in one of |T| + 1 different local states q during
a given execution stage s. There is one state q_k to simulate each agent
type t_k ∈ T, and an additional state EMPTY, to simulate the absence of
an agent from the surface location that p_i is simulating.

A state transition function: This function takes the current processor state
and the messages received in the current round, and probabilistically
directs what state the processor will adopt in the next round.

The messages processors send on the network are of the form (glue type,
electronic message). The input message buffers of processor p_i simulate the glue
types of the edges the agent at p_i's location is adjacent to, and the electronic
messages (if any) received from an agent's neighbors. The output message
buffers of p_i simulate the glues on the edges of the tile p_i is simulating, and
the electronic messages the agent transmits to its neighbors. The purpose of
COLOR_i is to simulate the color of the agent placed at the location simulated
by p_i.

All processors in 𝒩 are hardcoded with the same probabilistic state
transition function, which is determined from the definition of β (which we induced
above from the properties of ℳ), in the natural way: if, in round r of the
algorithm execution, p_i is in state q_k, a simulation of t_k ∈ T, and hears
messages that simulate glue types g_1, ..., g_d and electronic messages m_1, ..., m_d,
then at the end of round r, it will transition to state q_j with probability π_j,
where β(t_k, g_1, ..., g_d, m_1, ..., m_d, t_j) = π_j and each t_j is a distinct element of
T ∪ {∅}. As explained above, we denote by EMPTY the state that simulates the
"presence of the empty set," i.e., the absence of any agent from the location
simulated by p_i.

To simulate the process of self-assembly, we run the following distributed
algorithm on 𝒩.

Algorithm execution proceeds in synchronized rounds. Before execution
begins, all processors start in state EMPTY. In round r = 0 (through the
intervention of an omniscient operator), each processor in the locations corresponding
to the seed assembly enters the state that simulates the agent type at that location
in the seed assembly.

Also in round r = 0, each processor not simulating part of the seed
assembly "wakes up" (enters a state other than EMPTY) with probability π, the
multiple nucleation probability of ℳ. If a processor wakes up, it enters state
q ≠ EMPTY, chosen uniformly at random from the set of non-EMPTY states.
For any round r > 0, each processor runs either Algorithm 1 or Algorithm 2,
depending on whether it is in state EMPTY.

The interaction between agents in ℳ is completely defined by the glues
and electronic messages of an agent's immediate neighbors, as specified in the
function β and the algorithm of each agent type. The processors of 𝒩 simulate
that behavior with Algorithm 2. Since the processors of 𝒩 simulate empty
locations with Algorithm 1, by a straightforward induction argument, 𝒩 can
simulate all possible ℳ-assembly sequences, and the theorem is proved. □
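The round structure of the simulation (seed placement and random nucleation in round 0, followed by Algorithm 1 for EMPTY processors and Algorithm 2 for occupied ones) can be sketched in Python for the two-dimensional case d = 4. This is a minimal sketch under stated assumptions, not the construction of the proof: the agent types "black" and "white", the toy transition rule beta, the placeholder message "hello", and the parameter names (wake_prob, seed_assembly) are all hypothetical.

```python
import random

EMPTY = None
TYPES = ["black", "white"]          # hypothetical agent types

def neighbors(loc, n):
    """North, south, east and west neighbors inside an n x n surface."""
    x, y = loc
    cand = [(x, y + 1), (x, y - 1), (x + 1, y), (x - 1, y)]
    return [(a, b) for a, b in cand if 0 <= a < n and 0 <= b < n]

def beta(state, received):
    """Toy transition rule: returns {next_state: probability}.
    `received` is a list of (glue, message) pairs from occupied neighbors."""
    if state is not EMPTY:
        return {state: 1.0}                     # irreversible: agents stay
    glue, _ = received[0]
    other = "white" if glue == "black" else "black"
    return {other: 0.9, EMPTY: 0.1}             # attach with high probability

def sample(dist, rng):
    """Draw one next state from a {state: probability} dictionary."""
    states, weights = zip(*dist.items())
    return rng.choices(states, weights=weights)[0]

def run(n, seed_assembly, wake_prob, rounds, rng):
    # Round 0: place the seed, then let other processors nucleate at random.
    state = {(x, y): EMPTY for x in range(n) for y in range(n)}
    state.update(seed_assembly)
    for loc in state:
        if loc not in seed_assembly and rng.random() < wake_prob:
            state[loc] = rng.choice(TYPES)
    for _ in range(rounds):
        # Output buffers: each occupied processor sends (glue, message).
        outbox = {loc: (s, "hello") for loc, s in state.items() if s is not EMPTY}
        nxt = {}
        for loc, s in state.items():
            received = [outbox[w] for w in neighbors(loc, n) if w in outbox]
            if not received:
                nxt[loc] = s                    # no messages: state unchanged
            else:
                nxt[loc] = sample(beta(s, received), rng)
        state = nxt                             # synchronized round boundary
    return state

final = run(n=6, seed_assembly={(0, 0): "black"}, wake_prob=0.05,
            rounds=20, rng=random.Random(0))
occupied = sum(s is not EMPTY for s in final.values())
print(occupied, "of", 6 * 6, "locations occupied")
```

With wake_prob set to 0, every occupied location after r rounds lies within ℓ1-distance r of the seed, since a processor's state depends only on messages from its immediate neighbors in each round.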

We obtain our time lower bound results as corollaries of Theorem 4.

Corollary 1. If the (deterministic or probabilistic) binding rules of a multiply 
nucleating tile assembly system T are entirely local, then T is unable to solve 
the weak c-coloring problem in constant time. 

Proof. Suppose T is an irreversible tiling model. If T can weak c-color surfaces
in constant time, then there is a deterministic algorithm for the distributed
network 𝒩 that weak c-colors 𝒩 locally, and in constant time. By Theorem 2,
that is impossible.

Algorithm 1 For p_i in state EMPTY at round r
if r = 0 then
    wake up with probability π, and cease execution for this round.
end if
if r > 0 then
    Read the d-many input buffers.
    if no messages were received then
        cease execution for the round
    else
        Let q_0 be the state change obtained according to the probabilities β
        assigns to the space T ∪ {∅}, for a location that has adjacent glue types
        and electronic messages that are simulated by the messages received
        this round.
        Send the messages indicated by state q_0 and the behavior of ℳ.
        Set the value of COLOR_i according to q_0.
        Enter state q_0 and cease execution for this round.
    end if
end if

Algorithm 2 For p_i in state q ≠ EMPTY (at any round)
Read the d-many input buffers.
if no messages were received then
    Send the messages indicated by state q and the behavior of ℳ and cease
    execution for this round.
else
    Let q_0 be the state change obtained probabilistically, based on the
    probabilities the function β assigns to the space T ∪ {∅}, given input from
    the glue types and electronic messages simulated by the messages received
    this round.
    Send the messages indicated by state q_0.
    Set the value of COLOR_i according to q_0.
    Enter state q_0 and cease execution for this round.
end if

Figure 2: The tileset T* used in the proof of Proposition 1.

So assume T is a reversible tiling model, and when T assembles, it weak
c-colors the tiling surface, and achieves bond pair equilibrium in constant time.
Then there is a local probabilistic algorithm for 𝒩 that weak c-colors 𝒩 in
constant time, with positive probability of success. By Theorem 3, that is
impossible as well. Therefore, no T exists that weak c-colors surfaces in constant
time. □

By a similar argument, we obtain a lower bound for active self-assembling 
agents on a three-dimensional cubic grid. 

Corollary 2. If a model of 6-regular self-assembling agents has only local
binding rules, then it cannot solve the weak c-coloring problem in constant time on
a 3-dimensional mesh, for any value of c.

A physical interpretation of Corollary 2 would be that robots self-assembling
in three-space (i.e., k = 3 and, in this example, d = 6 so there are six arms
coming off each robot, orthogonally to one another) cannot achieve speedup to
constant time by self-assembling in separate groups and then joining the groups
together. This lower bound remains in effect even if the robots are designed by
a method that assigns each robot a unique identifier.

We conclude this section by noting that the weak c-coloring problem has
low tile complexity in the aTAM; that is, it can be defined using only a few
local rules.

Proposition 1. There is a tile assembly system in the abstract Tile Assembly
Model that weak c-colors the first quadrant, using only seven distinct tile types.

Proof. Figure 2 exhibits a tileset T* of seven tile types that assembles into a 
weak c-coloring of the first quadrant, starting from an individual seed tile placed 
at the origin. One can verify by inspection that T* is locally deterministic, so 
it will always produce the same terminal assembly. All assembly sequences 
generated by T* produce a checkerboard pattern in which a monochromatic 
"+" configuration never appears. Hence, it solves the weak c-coloring problem 
for the entire first quadrant, and also for all n x n squares, for any n. □ 

One can define a three-dimensional version of the tileset T* (shown in
Figure 2) in the natural way, using for example the 3D tile assembly model in [5].
Such a three-dimensional tileset will weakly c-color the three-dimensional mesh
where d = 6, with low tile complexity.
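The claim that a checkerboard pattern contains no monochromatic "+" configuration, in two dimensions and in its three-dimensional analogue, can be checked mechanically for small sizes. The Python sketch below assumes the simplest checkerboard, in which a cell's color is the parity of its coordinate sum; the exact pattern assembled by T* is given by Figure 2 and may differ, and the function names are illustrative.

```python
from itertools import product

def parity_coloring(n, k):
    """Color each cell of an n^k grid 1 or 2 by the parity of its coordinate sum."""
    return {v: 1 + sum(v) % 2 for v in product(range(n), repeat=k)}

def has_monochromatic_cross(coloring, n):
    """True if some cell with at least one neighbor matches all of its neighbors."""
    for v, color in coloring.items():
        nbrs = [v[:i] + (v[i] + s,) + v[i + 1:]
                for i in range(len(v)) for s in (-1, 1)
                if 0 <= v[i] + s < n]
        if nbrs and all(coloring[w] == color for w in nbrs):
            return True
    return False

for n in range(2, 6):
    assert not has_monochromatic_cross(parity_coloring(n, 2), n)  # n x n squares
    assert not has_monochromatic_cross(parity_coloring(n, 3), n)  # 3D mesh, d = 6
```

Under this assumption every neighbor of a cell has the opposite parity, so the pattern uses only two colors and is a weak c-coloring for any c ≥ 2.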

5 Conclusion 

In this paper, we showed that if a tile assembly model has only local binding 
rules, then it cannot use multiple nucleation on a surface to solve locally check- 
able labeling problems in constant time, even though the abstract Tile Assembly 
Model can solve a locally checkable labeling problem using just seven tile types. 
In fact, we proved a more general impossibility result, which showed the same 
lower bound applies to self-assembling agents in a three-dimensional grid that 
are capable of binding and subsequently sending messages to their neighbors. 
To the best of our knowledge, this was the first application of a distributed 
computing impossibility result to the field of self-assembly. 

There are still many open questions regarding multiple nucleation. Aggarwal 
et al. asked in [3] whether multiple nucleation might reduce the tile complexity 
of finite shapes. The answer is not known. Furthermore, we can ask for what
class of computational problems there exists some function f such that
we could tile an n × n square in time O(f), with O(1) < O(f) < O(n^2), and "solve"
the problem with "acceptable" probability of error, in a tile assembly model
that permits multiple nucleation. It would also be interesting to explore the 
possibility of modeling multiple nucleation of molecules floating in solution — 
instead of adhering to a surface — perhaps by using techniques from the field of 
ad hoc wireless networks. 

We hope that this is just the start of a conversation between researchers in 
the fields of distributed computing and biomolecular computation.